In his comment on Lindzen et al. (2001), Harrison (2001) found that the amount of high-level
clouds, A, and the sea-surface temperature beneath clouds, T, averaged over a large oceanic
domain in the western Pacific have secular linear trends of opposite signs over a period of 20 months. 
He found that when the linear trends are subtracted from the data, the correlation between the 
residual and T is much reduced. His estimates of the confidence levels for the correlation indicate, 
moreover, that this correlation is not statistically significant. 

The domain-averaged A and, to a lesser degree, T, have distinct intra-seasonal and seasonal 
variations. These variations are influenced by the large-scale wind and temperature distributions 
and by the seasonal variation of insolation. To separate the local effect from the effect of slowly 
changing large-scale conditions, rather than subtracting 20-month linear trends from the series,
which has the potential to spuriously extrapolate intra-seasonal and seasonal variations to even 
longer time scales, we subtracted 30-day running means of A and T from each time series; in effect, 
the data were high-pass filtered. The number of points (days), N, is reduced by this process from 
the original value of 510 to 480. 
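
The filtering step can be illustrated with a short Python sketch. This is not the authors' code: the function name `remove_running_mean` and the synthetic series are hypothetical, and the centered 31-day window (15 days on each side) is an assumption, chosen only because it reproduces the stated reduction from 510 to 480 points. The exact window handling used in the study is not specified in the text.

```python
import numpy as np

def remove_running_mean(x, half_width=15):
    """Subtract a centered running mean (high-pass filter).

    Points without a full window on both sides are dropped, so the
    output has len(x) - 2 * half_width samples.
    """
    x = np.asarray(x, dtype=float)
    width = 2 * half_width + 1              # assumed 31-day centered window
    kernel = np.ones(width) / width
    running_mean = np.convolve(x, kernel, mode="valid")
    return x[half_width:len(x) - half_width] - running_mean

# Synthetic stand-in for a 510-day series: slow seasonal cycle plus daily noise.
rng = np.random.default_rng(0)
days = np.arange(510)
series = 0.05 * np.sin(2 * np.pi * days / 365.0) + 0.01 * rng.standard_normal(510)

filtered = remove_running_mean(series)
print(len(filtered))   # number of days remaining after filtering
```

A centered window removes a linear trend exactly and strongly damps variations much longer than the window, which is the sense in which the data are high-pass filtered.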

Figures 1 and 2 show the scatter plots relating A to T without and with, respectively, removing 
the running means. When the running means are removed, the amount of scatter is reduced and 
the correlation coefficient, R, changes from -0.301 to -0.378. The slope B in the linear regression
of A against T remains nearly unchanged, at -0.022.

Given the number of samples from which it is calculated, it is reasonable to treat the correlation 
coefficient R as approximately normally distributed. Under the “null hypothesis” that there is no 
real correlation between A and T, sample estimates of R are expected to have mean zero and 
standard deviation $\sigma(R)$ given approximately by

$$\sigma(R) \approx \frac{1}{\sqrt{df}}, \qquad (1)$$

where df is the “effective number of degrees of freedom,” which would be just the number of samples 
N if time correlations in A and T were negligible and the number of samples is large. Given (1), 
in order to say, for instance, that R is significantly different from zero at the 5% significance level, 
we would require $|R| > 2\sigma(R)$ (the “2-σ level”). For a more thorough discussion of this topic and
of modifications needed when the number of samples is not so large, see, for example, von Storch
and Zwiers (1999), pp. 148-149. 

The number of samples N in the time series for A and T substantially overestimates df in
(1) because of the time correlations in the series. There are, however, methods of getting better
estimates of df. The correlation coefficient R is determined by the average product A'T', where A'
and T' are deviations from their respective 30-day running means. If the lagged autocorrelation of
A'T' can be approximated by an exponentially decreasing function of time separation, then df can
be estimated using the methods of Leith (1973) and Jones (1975) as 


$$df \approx N \, \frac{1 - \alpha_{A'T'}}{1 + \alpha_{A'T'}}, \qquad (2)$$

where $\alpha_{A'T'}$ is the lag-1 autocorrelation coefficient of the time series A'T'. It is computed from
our data to be 0.649. With N = 480, Eqs. (2) and (1) give

df = 102
$\sigma(R) = 0.099$.
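
The arithmetic behind these two values can be checked with a brief Python sketch. It assumes the reduction factor (1 - α)/(1 + α) for the effective degrees of freedom and the large-sample null standard deviation 1/sqrt(df), both of which reproduce the quoted numbers. The helper names are hypothetical, and the simple lag-1 estimator shown is one common choice, not necessarily the one used for the value 0.649.

```python
import math
import numpy as np

def lag1_autocorrelation(x):
    """Simple lag-1 autocorrelation estimate (assumed estimator)."""
    d = np.asarray(x, dtype=float)
    d = d - d.mean()
    return float(np.sum(d[:-1] * d[1:]) / np.sum(d * d))

def effective_dof(n, alpha):
    """Effective degrees of freedom for lag-1 autocorrelation alpha."""
    return n * (1.0 - alpha) / (1.0 + alpha)

def sigma_r(df):
    """Null-hypothesis standard deviation of R for large df."""
    return 1.0 / math.sqrt(df)

# Values quoted in the text: N = 480 days, lag-1 autocorrelation 0.649.
df = effective_dof(480, 0.649)
sigma = sigma_r(df)
print(round(df), round(sigma, 3))      # 102 0.099

# Two-sigma test applied to the filtered correlation R = -0.378.
significant = abs(-0.378) > 2.0 * sigma
print(significant)                     # True
```

In practice `lag1_autocorrelation` would be applied to the daily product series A'T' to obtain α before calling `effective_dof`.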

Using (1) to estimate the 95% confidence limits for R as $\pm 2\sigma(R)$, which is a valid approximation
as long as R is not too large (and in any case overestimates the limits), we find from the filtered data


R = -0.38 ± 0.20 (95% confidence limits). (3)


Although a number of assumptions and approximations have been used to obtain the estimate (3), 
the correlation lies sufficiently far outside the $\pm 2\sigma(R)$ range expected under the null hypothesis that it is reasonable to
treat it as statistically robust. We note in passing that virtually the same correlation was found 
when individual monthly means were subtracted from A and T instead of running means. 

The slope B of the linear regression of A' against T' can be written in terms of the correlation
R as

$$B = R \, \frac{\sigma_{A'}}{\sigma_{T'}}, \qquad (4)$$

where $\sigma_{A'}$ and $\sigma_{T'}$ are the standard deviations of the series. Given the large number of samples in
the time series, Eq. (4) implies that the 2-σ significance level of B is essentially the same as that
of R.

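
The relation between the regression slope and the correlation coefficient can be verified numerically. The sketch below uses synthetic data only (the arrays `a_prime` and `t_prime` are hypothetical stand-ins, not the study's series) and compares the least-squares slope with the product of R and the ratio of standard deviations.

```python
import numpy as np

# Synthetic, illustrative deviations; not the data used in the study.
rng = np.random.default_rng(1)
t_prime = rng.standard_normal(480)
a_prime = -0.02 * t_prime + 0.05 * rng.standard_normal(480)

# Correlation coefficient between the two series.
r = np.corrcoef(a_prime, t_prime)[0, 1]

# Slope predicted from R and the ratio of standard deviations.
slope_from_r = r * a_prime.std() / t_prime.std()

# Slope from an ordinary least-squares fit of a_prime against t_prime.
slope_lsq = np.polyfit(t_prime, a_prime, 1)[0]

print(np.isclose(slope_from_r, slope_lsq))
```

Because the slope is R multiplied by a ratio of standard deviations, which is well determined for long series, the relative uncertainty in the slope follows that of R.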
We have also estimated the significance of the correlation between T and the ratio of anvil
clouds to convection core as shown in Fig. 5d of Lindzen et al. (2001). The results are

df = 119 

$\sigma(R) = 0.092$.

The correlation and its confidence limits are estimated to be 

R = 0.50 ± 0.18 (95% confidence limits). (5)

Again, the correlation appears to be highly significant. 

A more detailed response to the criticisms of Harrison (2001) may be found at the Allen Press
BAMS web archive.

Summary 

Harrison’s (2001) Comment on the methodology in Lindzen et al. (2001) has prompted
re-examination of several aspects of the study. Probably the most significant disagreement in our
conclusions is due to our different approaches to minimizing the influence of long-time-scale
variations in the variables A and T on the results. Given the strength of the annual cycle and the
20-month period covered by the data, we believe that removing monthly means is a better approach
to minimizing the long-time-scale behavior of the data than removal of a linear trend, which might
actually add spurious long-time-scale variability into the modified data. We have also indicated
how our methods of establishing statistical significance differ. More definitive conclusions
may only be possible after more data have been analyzed, but we feel that our results are robust 
enough to encourage further study of this phenomenon. 

Figure Captions


Fig. 1. Scatter plot showing the relation between the high-level cloud amount A and the
cloud-weighted sea surface temperature T. The line is the linear regression, and R is the
correlation coefficient. Each data point represents a daily, domain-averaged value.

Fig. 2. Same as Fig. 1, except that the monthly running means are removed from A and T.